1) The file euroroad.txt contains the links of the international E-road network (https://en.wikipedia.org/wiki/International_E-road_network) between the European cities given in the file euroroad_city_names.txt. Each node in this network represents a city and an edge between two nodes denotes that they are connected by an E-road without any other city in-between.
df = read.csv("datasets/euroroad.txt", sep = " ")
df_n = read.csv("datasets/euroroad_city_names.txt", sep = "")
g = graph_from_data_frame(df, directed = TRUE)a) Provide a statistical summary of it. What are its most central cities?
Datos estadísticos del grafo:
Order: 1173.
Tamaño: 1416
Densidad: 0.001
Componentes: El número de componentes conexos es 26 y sus tamaños son 38, 1039, 3, 4, 8, 5, 2, 15, 3, 4, 5, 4, 4, 2, 2, 2, 2, 7, 2, 2, 2, 10, 2, 2, 2, 2
Diámetro: 41
Distancia media: 8.782487
Transitividad: 0.0339
Obtenemos las ciudades más centrales mediante la centralidad Betweeness de la siguiente forma:
r = round(gorder(g) * 0.01)
c_names = names(round(igraph::betweenness(g),4)[1:r])
most_central = c()
for(i in c_names){
most_central = c(most_central, as.character(df_n[i,1]))
}
most_central## [1] "Preston" "Birmingham" "Southampton" "Havre" "Paris"
## [6] "Orléans" "Bordeaux" "San" "Sebastián" "Burgos"
## [11] "Madrid" "Seville"
De esta manera se muestra las ciudades por donde se ha de pasar más para llegar a las demás ciudades.
b) How do you think a road network is formed?
c) Is this road network compatible with the E-R model, the configuration model, or the basic Barabasi-Albert’s model? Justify your answer, and comment whether you expected it.
Obtener orden, tamaño y densidad de \(g\):
o_g = gorder(g)
s_g = gsize(g)
den_g = edge_density(g, loops = FALSE)E-R Model:
G_ER = sample_gnp(gorder(g), den_g, directed=FALSE)
plot(G_ER, vertex.label=NA, vertex.shape="sphere", vertex.size=10, edge.color="black", vertex.color="green", edge.arrow.size=0.3)N = 10000
sample = replicate(N, transitivity(sample_gnp(gorder(g), den_g, directed=TRUE)))sum(which(sample > transitivity(g))) / N # p-value## [1] 0
round(c(quantile(sample, 0.025), quantile(sample, 0.975)), 5) # 95% IC## 2.5% 97.5%
## 0.00000 0.00518
Dado que \(p-value < \alpha\), rechazamos \(H_0\) con un nivel de significación al 0.05. Por lo tanto, la red de carreteras no es compatible con el modelo E-R.
Configuration Model:
G_CM = sample_degseq(out.deg = degree(g, mode="out"), in.deg = degree(g, mode = "in"), method = "simple")
G_CMs=simplify(G_CM)
plot(G_CMs, vertex.label=NA, vertex.shape="sphere", vertex.size=10, edge.color="black", vertex.color="green", edge.arrow.size=0.3)N = 10000
sample = replicate(N, transitivity(sample_degseq(out.deg = degree(g, mode="out"), in.deg = degree(g, mode = "in"), method = "simple")))sum(which(sample > transitivity(g))) / N # p-value## [1] 0
round(c(quantile(sample, 0.025), quantile(sample, 0.975)), 5) # 95% IC## 2.5% 97.5%
## 0.00000 0.00425
Igual que con el modelo E-R, dado que \(p-value < \alpha\), rechazamos \(H_0\) con un nivel de significación al 0.05. Por lo tanto, la red de carreteras tampoco es compatible con el modelo de configuración.
Barabasi-Albert Model:
G_BA_m1 = sample_pa(n = gorder(g), m = 1, directed = TRUE, out.pref = TRUE)
G_BA_m2 = sample_pa(n = gorder(g), m = 2, directed = TRUE, out.pref = TRUE)
G_BA_m3 = sample_pa(n = gorder(g), m = 3, directed = TRUE, out.pref = TRUE)
plot(G_BA_m1, vertex.shape="sphere", vertex.size=10, vertex.label.cex=0.5, edge.color="black",
vertex.color=c(rep("red", 10), rep("blue", 10), rep("green",70), rep("yellow",10)), edge.arrow.size=0.3)plot(G_BA_m2, vertex.shape="sphere", vertex.size=10, vertex.label.cex=0.5, edge.color="black",
vertex.color=c(rep("red", 10), rep("blue", 10), rep("green",70), rep("yellow",10)), edge.arrow.size=0.3)plot(G_BA_m3, vertex.shape="sphere", vertex.size=10, vertex.label.cex=0.5, edge.color="black",
vertex.color=c(rep("red", 10), rep("blue", 10), rep("green",70), rep("yellow",10)), edge.arrow.size=0.3)N = 10000
sample = replicate(N, transitivity(sample_pa(n = gorder(g), m = 1, directed = TRUE, out.pref = TRUE)))sum(which(sample > transitivity(g))) / N # p-value## [1] 0
round(c(quantile(sample, 0.025), quantile(sample, 0.975)), 5) # 95% IC## 2.5% 97.5%
## 0 0
Igual que con el modelo E-R, dado que \(p-value < \alpha\), rechazamos \(H_0\) con un nivel de significación al 0.05. Por lo tanto, la red de carreteras tampoco es compatible con el modelo de configuración.
d) If we add an initial attractiveness \(a\) to all nodes, so that the probability of an old node \(v\) to be linked by a new node is proportional to \(a+\deg(v)\), the absolute value of the exponent in the power law describing its degree distribution turns out to be \(3+a/c\) (with \(c\) the mean number of links of the new nodes). This exponent can be estimated with igraph’s function fit_power_law.
Now, is the E-road network compatible with Barabasi-Albert’s model with initial attractiveness? Justify your answer, and comment whether you expected it.
2) Consider the WikiVote directed network introduced in the Handout 4.
a) Is it compatible with the directed E-R model, the directed configuration model, or the directed Barabasi-Albert’s model (either basic or with initial attractiveness)? Justify your answer, and comment whether you expected it.
b) It should be clear that this network does not “grow” and that the number of votes received by each individual is somehow related to some notion of “fitness”. igraph provides some functions to generate random networks with these features (based on the model explained in https://arxiv.org/pdf/cond-mat/0106565.pdf). Is this WikiVote network compatible with the “Scale-free random graphs with vertex fitness scores” provided by the function sample_fitness_pl? Justify and comment your answer.